Evaluation and estimation of compressive strength of concrete masonry prism using gradient boosting algorithm

The compressive strength (CS) of the hollow concrete masonry prism is an important parameter in the design of masonry structures. The CS is generally determined by laboratory tests, which are time-consuming and costly. It is therefore useful to estimate the CS by other means, such as machine learning techniques. This study employed Gradient Boosting (GB) to evaluate and predict the CS of hollow masonry prisms. The database used for modeling consists of 102 hollow concrete specimens collected from previously published literature. The output is the CS of the hollow masonry prism, while the inputs are the compressive strength of mortar (fm), the compressive strength of blocks (fb), the height-to-thickness ratio (h/t), and the ratio fm/fb. To reduce overfitting, K-Fold cross-validation was used, and particle swarm optimization (PSO) was employed to obtain the optimum hyperparameters. The GB model was then built using these optimum hyperparameters. The results showed that the GB model performed very well in evaluating and predicting the CS of the hollow masonry prisms, with R2, RMSE, MAE, and MAPE values of 0.977, 0.803 MPa, 0.612 MPa, and 0.036%, respectively. The GB model in this study outperformed six machine learning models (decision tree, linear regression, random forest regression, ridge regression, artificial neural network, and Extreme Gradient Boosting) used in previous studies. The sensitivity analysis using SHAP and PDP-2D indicates that the CS depends strongly on fb (mean SHAP value of 3.2) and h/t (mean SHAP value of 1.63), while fm/fb (mean SHAP value of 0.57) has a small effect on the CS. Thus, this research provides a good method to evaluate and predict the CS of the hollow masonry prism, which can provide useful knowledge for practical application in this field.


Introduction
The compressive strength (CS) of hollow concrete masonry is one of the most vital mechanical properties, and it strongly affects the technical and economic aspects of masonry structures [1,2] as well as their safety [3]. However, the CS is quite challenging to determine because of the complex components of masonry structures [4,5]. For example, masonry structures include unreinforced and reinforced hollow interlocking compressed stabilized earth, braced frames incorporating masonry infills, steel reinforced grout composites and masonry substrate, and masonry concrete structures [6][7][8]. Many previous theoretical studies have investigated the strength and behavior of masonry hollow prisms [9][10][11]. In addition, some analytical models have been developed on the basis of equilibrium and deformation compatibility equations to estimate the CS. It was reported that the CS can be estimated using Eurocode 6 [12], which considers the compressive strength of each component material, such as the block and the mortar. Many previous studies have also used empirical models to estimate the CS based on laboratory results [13][14][15][16][17][18][19].
Most previous studies mainly considered the compressive strength of the masonry unit (fb) and the compressive strength of the mortar (fm); however, the CS of hollow blocks is also affected by other factors, such as the dimensions of the prism and the mortar [15]. To account for such factors, a previous study established an empirical model to predict the CS using VFb (volume fraction of masonry unit), Hp/B (height of masonry prism/width of the masonry unit), and VRmH (volume ratio of bed joints to mortar) as the input variables [15]. The results indicated that the proposed model performed well in predicting the CS, with a determination coefficient of R2 = 0.88. Nevertheless, that study applied only to a specific masonry unit, masonry unit dimension, and mortar joint thickness. Furthermore, it was indicated that it is difficult to use empirical models to estimate the CS with complex variables [20].
Recently, machine learning (ML) techniques have been widely applied to estimation problems in civil engineering, particularly for concrete [21][22][23][24][25]. In addition, different machine learning techniques such as Gene Expression Programming (GEP), Adaptive Neuro-Fuzzy Inference System (ANFIS), and multiple linear regression (MLR) have been used to estimate the compressive strength, flexural strength, and maximum deflection of concrete and reinforced concrete panels [26][27][28]. ML techniques are good solutions for predicting the properties of materials because they can deal with problems having multiple variables. Although many previous studies used ML techniques to forecast the compressive strength of concrete, only a few studies have estimated the CS of the hollow concrete masonry prism using ML techniques. A previous study used artificial neural networks (ANN) and ANFIS to estimate the CS of hollow concrete masonry prisms, with three main input parameters: the height-to-thickness ratio, the mortar compressive strength, and the unit compressive strength. The results indicated that the proposed models gave a high prediction accuracy with small error rates [29].
Previous studies used empirical models to predict compressive strength, and it was stated that those empirical models strongly depended on the height-to-thickness ratio of prisms, the mortar compressive strength, and the compressive strength of blocks. Nevertheless, these empirical models need to be re-assessed when new test results become available [29]. A previous study developed a cellular automata model to estimate the cracking patterns of vertically loaded masonry wallets [30]. Garzón-Roca et al. used ANN and fuzzy logic models to predict the compressive strength of brick masonry [31,32]. Besides, the ANN model was used to estimate the masonry failure surface subjected to biaxial compressive stress [33]. Asteris et al. [34] used a backpropagation artificial neural network to estimate the compressive strength of brick masonry, and the results indicated that the proposed models fitted the experimental results well. Some previous studies used ML techniques, namely fuzzy sets, fuzzy logic, and neural networks, to predict the compressive strength of masonry hollow blocks [35,36]. The data used for modeling were obtained from experimental work, including both direct tests and nondestructive tests (rebound hammer and ultrasonic pulse velocity), in order to estimate the compressive strength of masonry. The sensitivity analysis of a previous study revealed that the masonry unit's compressive strength is the most influential factor affecting the masonry compressive strength [37]. Besides, Fakharian et al. [38] also indicated that the most vital parameter affecting the CS prediction of hollow concrete prisms is the compressive strength of the concrete blocks. The results show that the ANN model achieved the best performance in predicting the CS, with a value of R = 0.950 and a value of 6.92%. The prediction of the CS using ANN and ANFIS was conducted with 66 datasets obtained from the literature, in which the input variables include fm, fb, and Hp/B [39]. The results indicated that the ANFIS performed well in predicting the CS, with a mean ratio of the actual value to the estimated value of 0.983. Furthermore, a previous study employed experimental datasets and used different ML techniques such as ANN, random forest regression, and XGBoost to forecast the CS. In that study, in addition to fb and fm, other input variables such as the dimensions of the masonry and the unit material were also considered. The results showed that the ANN models achieved the highest prediction accuracy, with R2 = 0.95 and RMSE = 1.83 MPa. However, most previous studies have not fully evaluated the influence of each variable on the CS prediction.
Gradient boosting (GB) is based on the boosting technique and is an ensemble technique proposed by Friedman [40]. The GB algorithm is built by combining weak learners into a stronger learner through an iterative sequence [41]. As a result, the performance of the GB algorithm can be strengthened, leading to a reduction in the total error and the loss of the model [42]. Previous studies stated that the GB technique has a basic advantage in preventing overfitting problems and that it uses fewer computational resources [42,43]. The GB algorithm has been employed in many studies for predicting the compressive strength of concrete [25,44]. A previous study compared the performance of GB, SVM, and random forest models in predicting genomic breeding values and concluded that the GB algorithm outperformed the other algorithms [45]. Six different machine learning techniques, including GB, were used to estimate the stability of open stope hanging walls, and it was concluded that the GB technique outperformed the other five [46]. The GB technique was also used to estimate the structural damage caused by blasting vibration, and it was found that the GB model achieved a high accuracy in predicting the structural damage [47].
Based on the above literature and the outstanding performance of the GB algorithm, this research employs the Gradient Boosting (GB) technique to predict and evaluate the CS of the hollow concrete masonry prism. As mentioned above, previous studies used different models to estimate the CS; however, the results showed a low prediction accuracy. Therefore, to improve the accuracy of the proposed model, K-Fold Cross-Validation was used and the hyperparameters were tuned by Particle Swarm Optimization (PSO) in this study. Furthermore, the influence of each input variable, as well as of coupled input parameters, on the compressive strength was evaluated using sensitivity analysis via partial dependence plots (PDP-2D) and SHapley Additive exPlanations (SHAP).

Research significance
The performance of hollow concrete masonry blocks is generally governed by the compressive strength (CS), which is the most important parameter of masonry structures. In general, the CS of hollow concrete masonry blocks is determined by laboratory experiments, which consume time and money. Besides, due to the complex components of masonry structures, the determination of the CS is a challenging task. Many theoretical and empirical approaches have therefore been developed to estimate the CS, but it was indicated that it is difficult to use empirical models to predict the CS with complex variables. For this reason, several recent studies have used machine learning (ML) techniques to forecast the CS of the hollow concrete masonry block. Nevertheless, previous studies did not achieve a high prediction accuracy. To address these remaining issues, this study used the GB model, which has a basic advantage in preventing overfitting problems. Furthermore, to enhance the prediction accuracy of the model, K-Fold cross-validation with 10 iterations was applied, and the optimum hyperparameters for the proposed model were obtained using PSO. The study highlights the significance of the different parameters that directly influence the compressive strength of the hollow concrete masonry block. The outcome of this study is actionable knowledge that can support engineers in the design and practical application of hollow concrete masonry prisms.

Data collection and description of the database
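In this study, 102 hollow concrete specimens were collected from the literature [29,44,48-55]. The database includes four input variables, namely the compressive strength of mortar (fm), the compressive strength of blocks (fb), the height-to-thickness ratio (h/t), and the ratio fm/fb; the output is the compressive strength of prisms (CS). The geometry and input variables influencing the CS of the hollow concrete prism are shown in Fig 1. From Table 1, it can be seen that the values of fm range from 4.6 to 22.8 MPa, while the fb values vary from 15.4 to 40.5 MPa. The h/t and fm/fb vary from 1.8 to 4.3 and from 0.19 to 1.48, respectively. The values of the output CS are in the range of 11.3 to 27 MPa. The details of the input variables and output can be found in Table 1. The magnitudes of the input variables and output are shown in Fig 2. It can be observed that the compressive strength of the standard block fb has a median value of 22 MPa with a range of deviation from 10 to 39 MPa. The median value of the compressive strength of mortar fm is approximately 13 MPa. The median values of fm/fb and h/t are 0.58 and 2.9, respectively. The compressive strength of masonry prisms has a median value of 18 MPa.
The data distributions of the input variables and output are presented in Fig 3, containing fb, fm, fm/fb, h/t, and CS (output). The fb values range from 10 to 40 MPa and are mostly concentrated at 25 MPa. fb is mostly independent of fm and h/t, while fb decreases with increasing fm/fb. There is a strong relationship between fb and CS: fb increases linearly with increasing CS. Besides, fm increases linearly with fm/fb, and, although the scatter between fm and CS is large, fm slightly increases with increasing CS. fm/fb ranges from 0 to 1.0 and is mostly located between 0.5 and 1.0. There is a large variation of fm/fb with the change of h/t; fm/fb slightly increases with increasing h/t. fm/fb also varies widely with CS; fm/fb slightly reduces with increasing h/t. From Fig 3, it can be seen that the values of h/t vary from 2 to 5. h/t has a large scatter with the change of CS, and the values of h/t slightly reduce with increasing CS. The values of CS range from 10 to 30 MPa and are mostly located around 18 MPa.
The correlation matrix between the input and output variables is shown in Fig 4. Among the input variables, the greatest correlation (R = 0.84) was found between fm and fm/fb, while the lowest was found between fm and fb (R = 0.04). The highest correlation between an input and the output was found for fb and CS (R = 0.79). From this result, it can be said that the correlations between inputs and output, as well as among inputs, are not high; thus, all these variables should be considered in the modeling.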
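The correlation coefficients quoted for Fig 4 are Pearson coefficients between pairs of variables. The following minimal sketch shows how such a matrix could be computed; the function names and the toy values are hypothetical illustrations and are not taken from the database of this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Pairwise Pearson coefficients for a dict mapping name -> list of values."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical values for illustration only (MPa); not the 102-specimen database.
toy = {
    "fm": [5.0, 10.0, 15.0, 20.0],
    "fb": [20.0, 25.0, 18.0, 30.0],
}
toy["fm/fb"] = [m / b for m, b in zip(toy["fm"], toy["fb"])]
matrix = correlation_matrix(toy)
```

Because fm/fb is derived from fm and fb, a high correlation between fm and fm/fb is expected whenever fb varies little relative to fm.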

Gradient boosting (GB)
The gradient boosting (GB) algorithm, like the random forest, is an ensemble technique; it was first suggested by Friedman [40]. The GB algorithm is founded on the boosting technique: weak learners are combined into a stronger learner in an iterative sequence [41]. Incorporating the predictors from the individual iterations strengthens the performance of the model. In addition, the total error and the loss of the model can be reduced in the GB algorithm [42]. As a result, the overfitting problem in the GB algorithm can be significantly decreased. In the GB technique, the weak learner is a regression tree, and in each iteration the model is trained using stochastic gradient descent to lessen the error. The first weak learner (first tree) is trained to reduce the error in the first iteration. Then, the second tree (i.e., the second weak learner) is trained to minimize the error remaining after the first tree. This process is repeated until the desired error is obtained. The framework of the GB algorithm is presented in Fig 5.
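The iterative scheme described above can be illustrated with a minimal sketch. It assumes a squared-error loss (so each new tree is fitted to the current residuals), a single input feature, and one-split regression trees (stumps) as weak learners; the function names, the toy data, and the default settings are hypothetical and do not reproduce the model of this study.

```python
def fit_stump(x, residuals):
    """Best one-split regression tree on a single feature, by squared error."""
    best = None
    for threshold in sorted(set(x))[:-1]:  # exclude the maximum so both sides are non-empty
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean


def fit_gradient_boosting(x, y, n_estimators=50, learning_rate=0.1):
    """Sequentially add stumps, each fitted to the residuals of the current ensemble."""
    base = sum(y) / len(y)  # initial prediction: mean of the targets
    stumps = []
    predictions = [base] * len(y)
    for _ in range(n_estimators):
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        predictions = [pi + learning_rate * stump(xi) for xi, pi in zip(x, predictions)]

    def predict(xi):
        return base + learning_rate * sum(s(xi) for s in stumps)

    return predict


# Hypothetical toy data (block strength vs. prism strength, MPa), for illustration only.
x_toy = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y_toy = [11.0, 12.0, 14.0, 15.0, 19.0, 21.0, 24.0, 26.0]
predict = fit_gradient_boosting(x_toy, y_toy)
```

The learning rate shrinks the contribution of each tree, which is one reason a larger number of estimators is usually paired with a smaller learning rate.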

Evaluation of model performance
To evaluate the performance of the proposed models, four common indicators were employed: the determination coefficient (R2), root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). The value of R2 varies from 0 to 1, and a higher value of R2 indicates better performance of the model. In contrast, for the three remaining indicators, higher values mean that the model has a lower prediction accuracy, while lower values indicate that the model achieves a higher prediction accuracy. These indicators can be computed using the following equations:
$$R^2 = 1 - \frac{\sum_{h=1}^{N} \left(T_h^{act} - V_h^{est}\right)^2}{\sum_{h=1}^{N} \left(T_h^{act} - T^{avg}\right)^2}$$
$$RMSE = \sqrt{\frac{1}{N} \sum_{h=1}^{N} \left(T_h^{act} - V_h^{est}\right)^2}$$
$$MAE = \frac{1}{N} \sum_{h=1}^{N} \left|T_h^{act} - V_h^{est}\right|$$
$$MAPE = \frac{100\%}{N} \sum_{h=1}^{N} \left|\frac{T_h^{act} - V_h^{est}}{T_h^{act}}\right|$$
where $T_h^{act}$ and $V_h^{est}$ are the experimental (actual) and estimated $h$-th values, respectively; $N$ is the total number of samples in a dataset; and $T^{avg}$ is the mean of the experimental values.
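The four indicators named above can be computed as in the following sketch. It assumes the standard textbook definitions, with MAPE expressed in percent; the function names and the example values are hypothetical.

```python
import math


def r2_score(actual, estimated):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((t - v) ** 2 for t, v in zip(actual, estimated))
    ss_tot = sum((t - mean_actual) ** 2 for t in actual)
    return 1.0 - ss_res / ss_tot


def rmse(actual, estimated):
    """Root mean square error, in the units of the target (e.g. MPa)."""
    return math.sqrt(sum((t - v) ** 2 for t, v in zip(actual, estimated)) / len(actual))


def mae(actual, estimated):
    """Mean absolute error, in the units of the target."""
    return sum(abs(t - v) for t, v in zip(actual, estimated)) / len(actual)


def mape(actual, estimated):
    """Mean absolute percentage error, in percent (assumes no actual value is zero)."""
    return 100.0 * sum(abs((t - v) / t) for t, v in zip(actual, estimated)) / len(actual)


# Hypothetical example values (MPa), for illustration only.
actual = [10.0, 20.0, 30.0]
estimated = [12.0, 18.0, 33.0]
scores = {
    "R2": r2_score(actual, estimated),
    "RMSE": rmse(actual, estimated),
    "MAE": mae(actual, estimated),
    "MAPE": mape(actual, estimated),
}
```

Note that some libraries return MAPE as a fraction rather than a percentage, so the convention should be checked before values are compared across studies.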

K-Fold cross validation
K-Fold Cross-Validation (CV) was employed to prevent the overfitting problem in this study [56][57][58]. The performance of the model is improved when K-Fold cross-validation is used, because the bias caused by the random selection of training and testing data can be reduced. The original dataset was split into two parts: the first part, consisting of 70% of the data, was used for training, while the remaining part was used for testing. The K-Fold CV was applied only to the training dataset in this research. The training dataset was divided into groups by K-Fold CV, with one group used for testing and the remaining groups used for training. Ten iterations were applied. In each iteration, the training dataset was randomly divided into subsets, of which one subset was used for testing while the nine remaining subsets were used to train the model. The K-Fold CV was conducted 10 times, and the optimal model was the one with the best performance value, such as the maximal coefficient of determination R2, the minimal Root Mean Square Error (RMSE), or the minimal Mean Absolute Error (MAE). In this study, the cost function of the hyperparameter tuning process (see later in Fig 7) is the coefficient of determination R2. Fig 6 shows the detailed structure of the K-Fold CV. The K-Fold CV technique with K = 10 is expressed as follows:
Step 1: The training dataset was split into ten subsets, of which nine subsets were used for training the model and the one remaining subset was used for testing.
Step 2: In each iteration, the model was built on the training subsets and then verified using the testing subset.
Step 3: The performance value of the ML model, the coefficient of determination R2, was calculated in each iteration, and the mean value of R2 over all subsets was calculated after all iterations.
Step 4: The model was then validated and, based on the mean error after the 10 iterations, the optimal model was selected.
The overall procedure of this study consists of the following steps:
Step 1: Dataset preparation. The dataset consists of the 102 hollow masonry concrete prisms, which were taken from the literature. After that, the dataset was divided into 2 categories: the first category (70% of the dataset) was used for training, while the remaining data (30% of the dataset) were used for validating the performance of the proposed model.
Step 2: Training and validating the model. In this step, the proposed model was trained on the training dataset using the GB algorithm. The hyperparameters of the GB model were obtained using particle swarm optimization (PSO) with the coefficient of determination R2 as the score. The highest R2 score of the hyperparameter tuning process was computed using K-Fold Cross-Validation with K = 10 iterations to prevent the overfitting issue.
In this step, the accuracy of the model was also assessed on the testing dataset via four indicators: R2, MAE, RMSE, and MAPE.
Step 3: Evaluating the feature importance. The optimal model was obtained based on the results of the evaluation and validation in Step 2. SHAP was used to evaluate the importance of each input variable for the compressive strength of the hollow masonry prism. In addition, the influence of individual and coupled input variables on the predicted output (the CS) was examined using the two-dimensional partial dependence plot (PDP-2D). Finally, the estimated and actual values of the CS were compared to assess the accuracy of the proposed model.
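The K-Fold procedure with K = 10 described in this section can be sketched as follows. The helper names, the fixed random seed, and the sample count of 71 (roughly 70% of 102 specimens) are assumptions made for illustration; `score_fold` stands for any user-supplied function that trains on the training indices and returns a score such as R2 on the held-out indices.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (training, validation) index lists for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)         # random assignment of samples to folds
    folds = [indices[i::k] for i in range(k)]    # k nearly equal-sized subsets
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


def cross_val_mean(score_fold, n_samples, k=10, seed=0):
    """Mean score over the k folds; score_fold(training, validation) returns one score."""
    scores = [score_fold(train, val) for train, val in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)


# Illustrative use: 71 training samples (an assumed count), ten folds.
splits = list(k_fold_indices(71, k=10))
```

Each sample appears in exactly one validation subset, so every training sample is used once for checking the model and nine times for fitting it.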

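Model hyperparameter tuning
To obtain the optimum hyperparameters of the proposed model, particle swarm optimization (PSO) was employed, and the performance index in the tuning process was the R2 indicator. The hyperparameter space of the proposed GB model used for tuning is presented in Table 2. Fig 8 shows the surface plot of the R2 values for different hyperparameter values of the GB model. As shown in the figure, optimum parameter values can be obtained in all cases over a wide range. As shown in Fig 8C, R2 values higher than 0.85 can be obtained when the learning rate is greater than 0.15 and the number of estimators is larger than 100 iterations. It can be observed that a max feature of 2 and a max depth of 2. The optimum hyperparameters were obtained from the optimization analysis using PSO, and these optimum hyperparameters are listed in Table 2.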
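The PSO-based search described above can be sketched as follows. This is a generic, minimal PSO that maximizes a user-supplied score within box bounds; the inertia and acceleration coefficients, the swarm size, and the toy score function are assumptions chosen for illustration. In the study, the score would be the mean cross-validated R2 of the GB model for a given hyperparameter vector, which is replaced here by a hypothetical quadratic surrogate.

```python
import random


def pso_maximize(score, bounds, n_particles=20, n_iterations=100,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Maximize score(position) inside box bounds with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_score = [score(p) for p in positions]
    best_index = max(range(n_particles), key=lambda i: personal_score[i])
    global_best = personal_best[best_index][:]
    global_score = personal_score[best_index]
    for _ in range(n_iterations):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (inertia * velocities[i][d]
                                    + c_personal * r1 * (personal_best[i][d] - positions[i][d])
                                    + c_global * r2 * (global_best[d] - positions[i][d]))
                # Move the particle and clip it back into the search space.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            current = score(positions[i])
            if current > personal_score[i]:
                personal_best[i], personal_score[i] = positions[i][:], current
                if current > global_score:
                    global_best, global_score = positions[i][:], current
    return global_best, global_score


def toy_score(params):
    """Hypothetical surrogate for the cross-validated R2; maximum 0 at (0.2, 150)."""
    learning_rate, n_estimators = params
    return -(learning_rate - 0.2) ** 2 - ((n_estimators - 150.0) / 100.0) ** 2


bounds = [(0.01, 0.5), (50.0, 300.0)]  # assumed ranges, not those of Table 2
best_params, best_score = pso_maximize(toy_score, bounds)
```

Integer hyperparameters such as the number of estimators or the maximum depth would need to be rounded before the model is trained, a detail this sketch leaves out.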

Prediction of the compressive strength of the prism using GB model
Fig 9 presents the comparison between the predicted and actual results of the GB model for the compressive strength of the prism for both the training and testing datasets. For the training dataset, the predicted values match the actual values well: the values of R2, RMSE, MAE, and MAPE are 0.977, 0.803 MPa, 0.612 MPa (minimal absolute error 0.011 MPa and maximal absolute error 2.243 MPa), and 0.036%, respectively. For the testing dataset, the predicted and actual values are almost the same: the values of R2, RMSE, MAE, and MAPE are 0.976, 0.906 MPa, 0.728 MPa (minimal absolute error 0.002 MPa and maximal absolute error 2.177 MPa), and 0.048%, respectively. The R2 values in this study are higher than those obtained in previous studies [20,29,38,59]. For example, Fakharian et al. [38] used ANN, Gene Expression Programming (GEP), and the Group Method of Data Handling (GMDH) to predict the CS of hollow concrete masonry blocks, and the values of R2 ranged from 0.825 to 0.903. Besides, a previous study used six different machine learning models (decision tree, linear regression, random forest regression, ridge regression, ANN, and XGBoost), and the results indicated that the value of R2 ranged from 0.558 to 0.950. The better prediction accuracy of this study compared with previous studies may be due to the reduction of the overfitting problem: as mentioned above, the GB technique has a basic advantage in preventing overfitting, and K-Fold cross-validation and optimum hyperparameters were used in this study. This indicates that the proposed GB model yielded a higher prediction accuracy than the models in previous works.
The comparison between the predicted and actual compressive strength is shown in Fig 10. It can be observed that the predicted and actual values match well. Almost all data points are located within the 20% error range; only one point of the testing dataset lies on the line y = ±1.2x. In the previous study that used ANN, GMDH, and GEP, the predicted and experimental values did not match well [38]: both training and testing points were located outside the 20% error range. Furthermore, another previous study [29] used different methods, including CSA S304.1-04 (Canadian Standards Association), Eurocode 6, ANN, ANFIS, and the method proposed by Sarhat et al. [59], to predict the compressive strength of hollow masonry prisms; the results showed large scatter, with values also located outside the line y = ±1.2x. In addition, a previous study that used different machine learning models, namely linear regression, decision tree, random forest, ridge regression, ANN, and XGBoost, showed that all models had a large number of data points located outside the 20% error envelope [20]; the percentage of data points outside the 20% error range varied from 29 to 72%. The result of this study indicates that the proposed model performed better than the models in the previous studies and achieved a high accuracy in predicting the compressive strength of the hollow masonry prism. Fig 11 shows the values of the CS for both training and testing. From Fig 11, it can be seen that for both the training and testing datasets, the predicted values are almost the same as the actual values. In detail, for the training dataset, the predicted value is slightly higher than the actual value; however, the standard deviation of the predicted values is smaller than that of the actual values. Similarly, for the testing dataset, the predicted values are also larger than the actual values, and the standard deviation of the predicted values is also smaller than that of the actual values.
Moreover, to make the results of this study more useful, an Excel calculation file containing the equation generated from the GB model is provided (cf. S1 Data). This file streamlines the estimation of the compressive strength of hollow concrete masonry for design engineers.
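The 20% error range used in the comparison of predicted and actual values can be checked numerically. The sketch below assumes that a point is inside the envelope when the predicted value lies between 0.8 and 1.2 times the actual value; the function name and the example values are hypothetical.

```python
def fraction_within_envelope(actual, predicted, tolerance=0.20):
    """Share of points whose prediction lies within +/- tolerance of the actual value."""
    inside = sum(
        1 for a, p in zip(actual, predicted)
        if (1 - tolerance) * a <= p <= (1 + tolerance) * a
    )
    return inside / len(actual)


# Hypothetical example values (MPa), for illustration only.
share = fraction_within_envelope([10.0, 20.0, 30.0, 40.0], [11.0, 25.0, 29.0, 31.0])
```

The percentage of points outside the envelope, as quoted for the previous studies, is simply one minus this share.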

Feature importance analysis
The feature importance analysis of the CS of the prism is shown in Fig 12. fb is the most important variable influencing the CS of the hollow masonry prism. This is consistent with the results of previous works on predicting the CS of masonry prisms using ML models [20,38]. The second most important factor is h/t, and fm is the third. The least important variable is fm/fb.
To understand the importance of each input variable, SHapley Additive exPlanations (SHAP) was used to calculate the contribution of each input variable [60,61]. The SHAP value is the mean marginal effect of an individual variable over all possible combinations of input parameters. The importance of each input variable can be computed from the absolute SHAP value, and variables with a high absolute SHAP value are considered important. The global feature importance is obtained by calculating the absolute SHAP value for all samples. Regarding fb, it can be observed that a higher value of fb results in a greater SHAP value, and this result is consistent with the results obtained in previous studies for both experimental and modeling approaches [20,38]. The second crucial variable that strongly influences the CS of hollow masonry prisms is h/t, with a mean absolute SHAP value of 1.63. From Fig 12, it can be seen that as the value of h/t increases, the SHAP value of h/t decreases. These results are also consistent with the results found in the previous work [20]. From these SHAP results, it can be concluded that SHAP improves the understanding of the proposed model and supports the conclusion that the prediction accuracy of the proposed model is acceptable. As a result, it can be implied that SHAP can enhance the interpretability of ML predictions.
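The idea of a mean marginal effect over all combinations of inputs can be made concrete with an exact Shapley computation for a small model. The sketch below is a simplified illustration, not the SHAP library: it assumes that an "absent" feature is replaced by a single baseline value, whereas SHAP implementations typically average over a background dataset. The function name and the toy model are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their actual value, the others the baseline value.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        phi.append(total)
    return phi


def toy_model(v):
    """Hypothetical model with one interaction term, for illustration only."""
    return 2.0 * v[0] + 3.0 * v[1] + v[0] * v[2]


phi = shapley_values(toy_model, x=[1.0, 1.0, 1.0], baseline=[0.0, 0.0, 0.0])
```

The contributions sum to the difference between the prediction and the baseline prediction, and averaging their absolute values over many samples gives a global importance ranking of the kind reported here.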
The input variables interact with one another, and these mutual influences greatly affect the CS of the hollow concrete prism. Thus, the partial dependence plot of two variables (PDP-2D) was used to evaluate the influence of coupled variables on the compressive strength of the hollow concrete specimens, as well as the mutual interaction between variables (Fig 13). It can be seen that when the value of fb is less than 21.19 MPa, variations in fm have an insignificant influence on the compressive strength of the hollow concrete specimens (Fig 13A). However, when the values of fb are greater than 21.19 MPa, a higher value of fm significantly influences the CS of the hollow concrete specimens. This result is consistent with the results found in the previous studies [20]. Fig 13B shows that the values of h/t have little influence on the CS of the hollow concrete specimens regardless of the fb values. The compressive strength slightly increases with the decrease in h/t. Similarly, in the case of Fig 13B, the compressive strength of the hollow concrete specimens is almost independent of the value of fm/fb with the change of fb. From Fig 13A, B, and C, it can be seen that the values of fb greatly affect the CS of the hollow concrete specimens: the CS increases with increasing fb. This result is in agreement with the results obtained in the previous studies [20,38]. The mutual effects of fm/fb and h/t on the CS of the hollow concrete specimens are presented in Fig 13D. It can be observed that fm/fb has a small influence on the CS with the change of h/t, while h/t affects the CS of the hollow concrete specimens; smaller values of h/t have a higher influence on the CS. Based on the results of the feature importance analysis, it can be summarized that fb is the most influential variable for the CS of the hollow masonry concrete block.
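A two-variable partial dependence surface of the kind shown in Fig 13 can be computed by fixing two features at grid values and averaging the model predictions over the dataset. The following minimal sketch assumes a generic `model` callable that takes one row of inputs; the function name, the toy model, and the toy rows are hypothetical.

```python
def partial_dependence_2d(model, rows, feature_a, feature_b, grid_a, grid_b):
    """Average prediction over `rows` with two features fixed at each grid pair."""
    surface = []
    for a in grid_a:
        line = []
        for b in grid_b:
            total = 0.0
            for row in rows:
                modified = list(row)
                modified[feature_a] = a   # overwrite the first feature of interest
                modified[feature_b] = b   # overwrite the second feature of interest
                total += model(modified)
            line.append(total / len(rows))
        surface.append(line)
    return surface


def toy_model(row):
    """Hypothetical model with an interaction between features 0 and 1."""
    return row[0] * row[1] + row[2]


toy_rows = [[1.0, 1.0, 2.0], [5.0, 5.0, 4.0]]
surface = partial_dependence_2d(toy_model, toy_rows, 0, 1, grid_a=[0.0, 2.0], grid_b=[1.0, 3.0])
```

In this toy case the effect of the second feature depends on the value of the first, which is the kind of coupled behavior that the fb threshold of 21.19 MPa reflects in Fig 13A.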

Conclusions
This study evaluated and predicted the compressive strength (CS) of the hollow masonry prism using the Gradient Boosting model. A total of 102 datasets taken from the literature were used for modeling. The input variables are the compressive strength of mortar (fm), the compressive strength of blocks (fb), the height-to-thickness ratio (h/t), and the ratio fm/fb, whereas the output is the CS of the hollow concrete masonry block. The optimum hyperparameters of the model were obtained using PSO. The importance of individual and coupled input variables for the CS of the prisms was evaluated using SHAP and PDP-2D.
• The GB model performed very well in evaluating and predicting the CS of the hollow masonry concrete blocks, with R2, RMSE, MAE, and MAPE values of 0.977, 0.803 MPa, 0.612 MPa, and 0.036%, respectively.
• The prediction accuracy of the proposed GB model is higher than that of other machine learning models (such as decision tree, linear regression, random forest regression, ridge regression, ANN, and XGBoost) used in previous studies.
• The results of the sensitivity analysis using SHAP and PDP-2D indicate that the compressive strength of blocks (fb) is the most dominant factor affecting the CS of the blocks. The second variable that strongly influences the CS of the prism is the h/t ratio, while fm/fb is the least important variable affecting the CS of the prism.
• From the results of this study, it can be concluded that the proposed GB model provides a good method to evaluate and predict the CS of the hollow concrete masonry prism, which can bring valuable knowledge for design and practical application in this field.
The findings of this study indicate that the GB model is a good technique for predicting the compressive strength of hollow masonry prisms, which could support engineers in design and practical application. However, the data employed in this study were sourced exclusively from several references, with a medium data size. Future research should incorporate data from a broader range of sources to enhance the accuracy and generalizability of the predictive models.


Fig 11. The values of compressive strength in training and testing. https://doi.org/10.1371/journal.pone.0297364.g011